Distributed Joins and Data Placement for Minimal Network Traffic
Authors
Abstract
Similar Resources
Minimal Cost Reconfiguration of Data Placement in Storage Area Network
Video-on-Demand (VoD) services require frequent updates in file configuration on the storage subsystem, so as to keep up with the frequent changes in movie popularity. This defines a natural reconfiguration problem in which the goal is to minimize the cost of moving from one file configuration to another. The cost is incurred by file replications performed throughout the transition. The problem...
Adaptive Data Placement for Distributed-memory Machines
Programming distributed-memory machines requires careful placement of data on the nodes. This is because achieving efficiency requires balancing the computational load among the nodes and minimizing excess data movement between the nodes. Most current approaches to data placement require the programmer or compiler to place data initially and then possibly to move it explicitly during a computatio...
Approximate Data Stream Joins in Distributed Systems
The emergence of applications producing continuous high-frequency data streams has brought forth a large body of research in the area of distributed stream processing. In the presence of high volumes of data, efforts have primarily concentrated on providing approximate aggregate or top-k type results. Scalable solutions for providing answers to window join queries in distributed stream processing s...
Entropy-based Consensus for Distributed Data Clustering
The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...
Distributed Probabilistic Network Traffic Measurements
Measuring the per-flow traffic in large networks is very challenging due to the high performance requirements on the one hand, and due to the necessity to merge locally recorded data from multiple routers in order to obtain network-wide statistics on the other hand. The latter is nontrivial because traffic that traversed more than one measurement point must only be counted once, which requires ...
Journal
Journal title: ACM Transactions on Database Systems
Year: 2018
ISSN: 0362-5915,1557-4644
DOI: 10.1145/3241039